Getting Actual Distributed Query Plan

Warning, this should not be done in production as this can take a lot of resources and cause sporadic behavior when executing in SSMS. Also consider executing in sqlcmd. Please also note this is the actual execution plan. All deletes, inserts, etc., will be processed since the query is actually running.

Using SSMS:

Step1:

Disable result set caching to ensure query is not returned from the cache.

Most importantly, set query diagnostics on.

Please note that this is the query’s actual execution plan, not estimated. Therefore, the query will actually run on the system and provide the actual plan. This is a very resource intensive process, please consider only running in Dev environments where the instance can be paused and restarted if sessions are not able to be killed and rolled back quick enough.

SET RESULT_SET_CACHING OFF
set query_diagnostics on   --Setting that allows execution plan
select * from synapseworkflow1 syn1
join synapseworkflow2 syn2
on syn1.ID= syn2.FID
and where syn2.JobTitle =’dataengineer’;

 

Step2:

The above will return the below output. This will include the results and actual compute node text and compute node plan. Remember, Synapse breaks down queries into smaller queries in 60 data distributions.

There will be two sets of outputs total of 120.

Graphical user interface, application, table, Excel

Description automatically generated

Step3:

Copy the data in one of the 60 showplan column and save it as a .sqlplan and open the plan in SSMS. You will see the actual distributed compute node plan similar to the below.

Graphical user interface, application

Description automatically generated

 

Using the safer sqlcmd:

Step1:

Ensure that the latest sqlcmd is installed.

Use the below command replacing the <synpasenamehere> with you Synapse SQL Pool name, username, database name and location of file and location to save file. Please note that all the parameters in the sqlcmd command will be required to successfully execute the below.

sqlcmd -S <synpasenamehere>.database.windows.net -U <usernamehere> -d <database_namehere> -I -i "C:\Users\victor\Documents\pathnamerhere\inputquerydiag.sql" -o "C:\Users\ victor\Documents\ pathnamerhere \synapseoutputfile.txt" -y0

Step2:

Notice the query text file that is being invoke looks similar to the above query:

Graphical user interface, text, application

Description automatically generated

Step3:

 

Execute the sqlcmd command and once completed there will be an output file.

Open the file and save portions of the distributed query as a .slqplan file and review the actual compute node distributed plan.